Parsing with Intraclausal Coordination and Clause Detection
Abstract
Syntactic analysis, i.e., parsing of text, is used in various tasks such as machine translation and question answering. The structure of a sentence is represented as a tree. Parsing long sentences is difficult. Our motivation was to analyze sub-units of the sentence independently, which could improve overall parsing accuracy. We developed a new parsing algorithm that includes intraclausal coordination and clause detection. Parsing with clause detection was first tried by Abney (1), whose algorithm delimits non-embedded clauses before the complete parse is made. A rule-based parser in which clause identification is included in the parsing process is briefly described in (2). A detailed description of our new algorithm can be found in (3). To our knowledge, the algorithm is the first to use intraclausal coordination detection in conjunction with clause detection before parsing. The most important contribution is the decrease in the number of parsing errors for Slovene by 7.1% and 6.4% compared to the Malt (4) and MSTP (5) baseline parsers, respectively.
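The reported improvements read as relative reductions in error count, that is, the share of the baseline parser's errors that the new algorithm removes. A minimal sketch of that arithmetic follows, assuming this interpretation; the function name and the error counts are hypothetical and chosen only for illustration, not taken from the paper.

```python
def relative_error_reduction(baseline_errors: int, new_errors: int) -> float:
    """Return the fraction of baseline errors removed by the new parser."""
    if baseline_errors <= 0:
        raise ValueError("baseline_errors must be positive")
    return (baseline_errors - new_errors) / baseline_errors


# Hypothetical counts (not from the paper): a baseline parser makes
# 1000 attachment errors and the new parser makes 929 on the same data.
baseline_errors = 1000
new_errors = 929
reduction = relative_error_reduction(baseline_errors, new_errors)
print(f"{reduction:.1%}")  # prints "7.1%"
```

A relative reduction depends on the baseline's own error count, so the same absolute gain gives a different percentage against a stronger or weaker baseline parser.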
Similar resources
Intraclausal Coordination and Clause Detection as a Preprocessing Step to Dependency Parsing
The impact of clause and intraclausal coordination detection on dependency parsing of Slovene is examined. New methods based on machine learning and heuristic rules are proposed for clause and intraclausal coordination detection. They were included in a new dependency parsing algorithm, PACID. For evaluation, the Slovene dependency treebank was used. At parsing, 6.4% and 9.2% relative error reduct...
Parsing With Clause and Intra-clausal Coordination Detection
We present a new dependency parsing algorithm based on the decomposition of large sentences into smaller units such as clauses and intraclausal coordinations. For the identification of these units, new methods combining machine learning techniques and heuristic rules were developed. The algorithm was evaluated on the Slovene dependency treebank text corpus. Compared to the MSTP parser, currentl...
Coordination Boundary Identification with Similarity and Replaceability
We propose a neural network model for coordination boundary detection. Our method relies on two common properties — similarity and replaceability in conjuncts — in order to detect both similar and dissimilar pairs of conjuncts. The model improves the identification of clause-level coordination using bidirectional recurrent neural networks incorporating two properties as features. We show that o...
Relative clause attachment ambiguity resolution in Persian
The present study seeks to find how Persian native speakers resolve relative clause attachment ambiguities in sentences containing a complex NP of the type NP of NP followed by a relative clause (RC). Previous off-line studies have found a preference for high attachment. In the present study, an on-line technique was used to help identify the nature of this process. Persian speakers were pre...
Relative Clause Ambiguity Resolution in L1 and L2: Are Processing Strategies Transferred?
This study aims at investigating whether Persian native speakers highly advanced in English as a second language (L2ers) can switch to optimal processing strategies in the languages they know and whether working memory capacity (WMC) plays a role in this respect. To this end, using a self-paced reading task, we examined the processing strategies 62 Persian speaking proficient L2ers used to read...
Journal title:
- Informatica (Slovenia)
Volume 34, Issue
Pages -
Publication date 2010